__author__ = 'Robert Nikutta <robert.nikutta@noirlab.edu>, Stéphanie Juneau <stephanie.juneau@noirlab.edu>, Knut Olsen <knut.olsen@noirlab.edu>, David Nidever <david.nidever@noirlab.edu>, Alice Jacques <alice.jacques@noirlab.edu>'
__version__ = '20230710' # yyyymmdd; version stamp of this notebook
__datasets__ = ['smash_dr1', 'delve_dr2'] # datasets used in this notebook
__keywords__ = ['science example','dwarf galaxies'] # keywords relevant to this notebook, e.g. ['science case','dwarf galaxies']
Robert Nikutta, Stéphanie Juneau, Knut Olsen, David Nidever, Alice Jacques & Astro Data Lab Team
We will use a dwarf galaxy detection algorithm based on spatial overdensities of blue stars to re-discover the Hydra II dwarf galaxy in the SMASH DR1 dataset. Hydra II (red circle in the map below) is one of ~60 dwarf galaxy satellites of the Milky Way.
Background
Ultra-faint dwarf galaxies are crucial to understanding many aspects of the universe. For instance, they are dominated by dark matter; their localization in space can thus trace the large-scale structure of the dark matter distribution. Furthermore, dwarf galaxies are suspected to host intermediate-mass black holes (IMBHs), which have so far eluded efforts to find them. IMBHs would naturally bridge the gap between solar-mass black holes and the super-massive black holes that reside at the center of virtually every large galaxy.
Data retrieval
We will retrieve data from Field 169 of the SMASH catalog (Nidever et al. 2017, AJ, 154, 199) and look for overdensities of blue objects.
The required columns are RA, Dec, and the g, r, i magnitudes.
Detection
We will convolve the spatial distribution of our dataset with a pair of Gaussian kernels and subtract the results, as done in e.g. Stanford et al. (2005, ApJ, 634, 2, L129) (galaxy clusters) or Koposov et al. (2008, ApJ, 686, 279) (MW satellites). This is equivalent to convolving the spatial distribution with a Mexican hat filter, which is useful for detecting objects at a desired spatial scale.
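The idea can be illustrated in one dimension before applying it to the data: subtracting a wide Gaussian from a narrow one yields a Mexican-hat-like profile. A minimal numpy sketch with illustrative widths (not the kernels used later in this notebook):

```python
import numpy as np

# Difference of Gaussians: narrow minus wide approximates a Mexican hat.
# The widths (1.0 and 3.0) are illustrative only.
x = np.linspace(-10, 10, 201)

def gauss(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

dog = gauss(x, 1.0) - gauss(x, 3.0)  # difference of Gaussians
# Positive core at the center, negative wings further out:
print(dog[100] > 0, dog[25] < 0)  # True True
```

The positive core responds to structure near the small-kernel scale, while the negative wings subtract the smoothly varying background.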
If you use this notebook for your published science, please acknowledge the following:
Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, http://dx.doi.org/10.1117/12.2057445
Data Lab disclaimer: http://datalab.noirlab.edu/disclaimers.php
# std lib
from getpass import getpass
import warnings
warnings.filterwarnings('ignore') # to suppress some astropy deprecation warnings
# 3rd party
import numpy as np
from astropy import utils, io, convolution, stats
from astropy.visualization import make_lupton_rgb
from photutils.detection import find_peaks # in older photutils versions: from photutils import find_peaks
from pyvo.dal import sia
import pylab as plt
%matplotlib inline
# Data Lab
from dl import authClient as ac, queryClient as qc
from dl.helpers.utils import convert
# plots default setup
plt.rcParams['font.size'] = 14
We will query the averaged photometry table from the SMASH catalog and select field #169. We'll issue two queries, one in which we select blue (-0.4 < g-r < 0.4) star-like objects (|sharp|<0.5), and the other without those constraints, to be used for comparison. For both queries, we will avoid photometry taken only with short exposures (depthflag>1), exclude objects with fewer than 4 detections, and apply a magnitude constraint of 9 < g < 25, so as to avoid excessive contamination.
field = 169 # SMASH field number to query
# Create a query of all sources in the Field, for comparison
query_all =\
"""SELECT ra,dec,gmag,rmag,imag
FROM smash_dr1.object
WHERE fieldid = '{:d}' AND
depthflag > 1 AND
ndetr > 3 AND ndetg > 3 AND
gmag BETWEEN 9 AND 25 AND random_id<10""".format(field)
# Create the query string; SQL keyword capitalized for clarity
# depthflag > 1 = no short exposures please
# ndetr, ndetg > 3 = more than 3 detections in r & g bands
# abs(sharp) < 0.5 = avoid broad objects
query =\
"""SELECT ra,dec,gmag,rmag,imag
FROM smash_dr1.object
WHERE fieldid = '{:d}' AND
depthflag > 1 AND
ndetr > 3 AND ndetg > 3 AND
abs(sharp) < 0.5 AND
gmag BETWEEN 9 AND 25 AND
(gmag-rmag) BETWEEN -0.4 AND 0.4""".format(field)
Running the query in synchronous mode is very easy.
%%time
df_all = qc.query(sql=query_all,fmt='pandas') # fmt='pandas' returns a Pandas DataFrame (the default is a CSV-formatted string)
print("Number of objects:", len(df_all))
print(df_all.head())
Number of objects: 117455
ra dec gmag rmag imag
0 184.876934 -32.868654 24.867119 24.545872 24.546236
1 184.892606 -32.874418 24.309147 25.094023 24.796555
2 184.894265 -32.873554 24.608845 24.403374 24.662037
3 184.895268 -32.872408 24.590645 24.494490 24.497213
4 184.896137 -32.871733 24.677746 24.908146 24.395891
CPU times: user 139 ms, sys: 61.6 ms, total: 200 ms
Wall time: 4.87 s
%%time
R = qc.query(sql=query,fmt='pandas') # fmt='pandas' returns a Pandas DataFrame (the default is a CSV-formatted string)
print("Number of objects:", len(R))
print(R.head())
Number of objects: 104974
ra dec gmag rmag imag
0 184.876674 -32.873511 24.746605 24.838743 24.185682
1 184.876606 -32.870861 24.156397 24.068817 23.074945
2 184.875853 -32.867214 24.084047 24.028060 23.630045
3 184.877080 -32.869780 24.482061 24.446104 23.858896
4 184.878492 -32.866905 24.678942 24.714973 24.624266
CPU times: user 121 ms, sys: 35.6 ms, total: 157 ms
Wall time: 4.44 s
You might spot some overdensities already, particularly in the result with blue star-like objects.
fig, (ax1, ax2) = plt.subplots(1,2,figsize=(14,7), sharey=True, constrained_layout=True)
h1 = ax1.hexbin(df_all['ra'], df_all['dec'],gridsize=200)
ax1.set_xlabel('RA')
ax1.set_ylabel('Dec')
ax1.set_title('~100000 random objects')
h2 = ax2.hexbin(R['ra'], R['dec'],gridsize=200)
ax2.set_xlabel('RA')
ax2.set_title('~100000 blue stellar objects')
fig.colorbar(h1,label='number of objects per spatial bin');
Here we define the dwarf filter as a differential convolution of a two-dimensional image using two Gaussian kernels; this has the effect of convolution with a Mexican hat filter. The default kernel shapes look for objects on the scale of a few arcmin. The output includes a clipped array of the convolved spatial distribution, which we will use for peak detection.
def dwarf_filter(ra, dec, fwhm_small=2.0, fwhm_big=20.0):
"""Differential convolution with 2D Gaussian kernels.
Based on Koposov et al. (2008).
Code by Ken Mighell and Mike Fitzpatrick.
Minor edits by RN.
Parameters
----------
ra, dec : float or array
RA & Dec in degrees.
fwhm_small, fwhm_big : float
Full-width half maximum sizes of the small and big Gaussian kernels
to use in convolution, in arcminutes.
"""
x, y = ra, dec
print("Computing differential convolution .... ",)
# Information about declination (y) [degrees]
ymean = (y.min() + y.max()) / 2.0
ydiff_arcmin = (y.max() - y.min()) * 60.0 # convert from degrees to arcmin
# Information about right ascension (x) [degrees]:
xdiff = x.max() - x.min() # separation in RA [degrees]
xmean = (x.min() + x.max()) / 2.0
# apply the cos(Dec) factor to convert RA separation to true angular separation:
xdiff_angular = (x.max() - x.min()) * np.cos(ymean*(np.pi/180.0))
# convert from degrees to arcmin
xdiff_angular_arcmin = xdiff_angular * 60.0
# Get the number of one-arcmin pixels in the X and Y directions:
nx = np.rint(xdiff_angular_arcmin).astype('int')
ny = np.rint(ydiff_arcmin).astype('int')
# Create a two-dimensional histogram of the raw counts:
Counts, xedges, yedges = np.histogram2d (x, y, (nx,ny) )
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
raw_hist = np.rot90(Counts).copy() # rotate so the histogram is oriented like an image (RA along x, Dec along y)
# Make the small and big Gaussian kernels with a standard deviation
# of the given FWHM in arcmin^2 pixels.
kernel_small = convolution.Gaussian2DKernel(fwhm_small/2.35,factor=1)
kernel_big = convolution.Gaussian2DKernel(fwhm_big/2.35,factor=1)
# Compute the differential convolution kernels.
conv_big = convolution.convolve(raw_hist, kernel_big)
conv_small = convolution.convolve(raw_hist, kernel_small)
conv_delta = conv_small - conv_big
delta = conv_delta.copy()
# Compute statistics and the floor
mean = np.mean(delta, dtype='float64')
sigma = np.std(delta, dtype='float64')
sigmaRaw = np.std(raw_hist,dtype='float64')
median = np.median(delta) # not used
floor = mean
print('dwarf_filter: mean = {:g} sigma = {:g} sigmaRaw = {:g}'.format(mean, sigma, sigmaRaw))
clipped = delta.copy()
clipped[delta < floor] = floor
# Return the computed fields.
return raw_hist, extent, delta, clipped, sigma
We'll use the default convolution kernels of 2 and 20 arcminutes in size.
%time
small_k, big_k = 2., 20. # kernel sizes in arcminutes
raw, extent, delta, clipped, dsigma = dwarf_filter(R['ra'],R['dec'],fwhm_small=small_k,fwhm_big=big_k)
CPU times: user 1e+03 ns, sys: 0 ns, total: 1e+03 ns
Wall time: 5.01 µs
Computing differential convolution ....
dwarf_filter: mean = 0.0890555 sigma = 1.79455 sigmaRaw = 5.33591
fig, ax = plt.subplots(figsize=(7,6))
im = plt.imshow(clipped)
plt.xlabel('pixel')
plt.ylabel('pixel')
plt.colorbar(label='relative spatial density after convolution');
Some peaks are visible; let's locate them automatically.
We'll use the photutils package to identify peaks lying at least 10 above the median in the clipped filtered image.
# find peaks
mean, median, std = stats.sigma_clipped_stats(clipped,sigma=3.0,maxiters=5)
tbl = find_peaks(clipped,median+10,box_size=int(small_k*2)) # box_size must be an integer number of pixels
# add ra & dec positions of peaks found
a, b = extent[:2]
xvec = np.arange(a,b,(b-a)/clipped.shape[1])
a, b = extent[2:]
yvec = np.arange(a,b,(b-a)/clipped.shape[0])
tbl['ra'] = xvec[tbl['x_peak']]
tbl['dec'] = yvec[-tbl['y_peak']-1]
print(tbl)
x_peak y_peak peak_value ra dec
------ ------ ------------------ ------------------ -------------------
86 89 11.311524537935105 185.41055953270728 -31.976003447411664
34 100 11.512626647783904 184.39348041275326 -32.15954180336243
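The reversed indexing into yvec (yvec[-tbl['y_peak']-1]) compensates for the np.rot90 applied inside dwarf_filter. A toy check of that bookkeeping, using a small hypothetical histogram with a single injected peak:

```python
import numpy as np

# histogram2d output: axis 0 = x (RA), axis 1 = y (Dec)
counts = np.zeros((5, 4))
counts[2, 3] = 1.0             # single "peak" at RA bin 2, Dec bin 3
img = np.rot90(counts)         # as done inside dwarf_filter
y_peak, x_peak = np.unravel_index(np.argmax(img), img.shape)
ny = counts.shape[1]
# Row i of the rotated image maps to Dec bin (ny - 1 - i) of the original,
# which is exactly what yvec[-y_peak-1] picks out.
print(x_peak == 2, (ny - 1 - y_peak) == 3)  # True True
```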
ecs = ['w','y'] # colors of box frames
ax.scatter(tbl['x_peak'],tbl['y_peak'],marker='s',s=tbl['peak_value']*40,c='none',edgecolors=ecs,lw=3) # keeps writing to previous ax
fig # repeats (the updated) figure
Data Lab comes with batteries included. Image cutout and download services are built in.
We'll just write two little functions:
# set up SIA
DEF_ACCESS_URL = "https://datalab.noirlab.edu/sia/coadd_all"
svc = sia.SIAService(DEF_ACCESS_URL)
# a little func to download the deepest stacked images
def download_deepest_images(ra,dec,fov=0.1,bands=list('gri')):
imgTable = svc.search((ra,dec), (fov/np.cos(dec*np.pi/180), fov), verbosity=2).to_table()
print("The full image list contains {:d} entries.".format(len(imgTable)))
sel0 = (imgTable['proctype'] == 'Stack') & (imgTable['prodtype'] == 'image') # basic selection
images = []
for band in bands:
print("Band {:s}: ".format(band)) #, end='')
sel = sel0 & (imgTable['obs_bandpass'] == band) # add 'band' to selection
Table = imgTable[sel] # select
row = Table[np.argmax(Table['exptime'].data.data.astype('float'))] # pick image with longest exposure time
url = row['access_url'] # get the download URL
print('downloading deepest stacked image...')
img = io.fits.getdata(utils.data.download_file(url,cache=True,show_progress=False,timeout=120))
images.append(img)
print("Downloaded {:d} images.".format(len(images)))
return images
# multi panel image plotter
def plot_images(images,titles=list('gri'),cmap=plt.cm.gray_r):
fig = plt.figure(figsize=(8,10),tight_layout=True)
grid = plt.GridSpec(4, 3)
ax1 = fig.add_subplot(grid[0, 0])
ax2 = fig.add_subplot(grid[0, 1])
ax3 = fig.add_subplot(grid[0, 2])
ax4 = fig.add_subplot(grid[1:, 0:])
for j,ax,img in zip(np.arange(4),[ax1,ax2,ax3,ax4],images):
ax.imshow(img,origin='lower',interpolation='none',cmap=cmap,norm=plt.mpl.colors.PowerNorm(0.1))
ax.axis('off')
ax.set_title('{:s}'.format(titles[j]))
Download the deepest stacked image cutouts (in 3 bands) around the position of the peak marked with a yellow box. Depending on network speed and system load, this can take a few seconds. Also create a 3-band false-color composite.
bands = list('gri')
idx = 1
print(tbl['ra'][idx], tbl['dec'][idx])
images = download_deepest_images(tbl['ra'][idx], tbl['dec'][idx], fov=0.1, bands=bands) # FOV in deg
184.39348041275326 -32.15954180336243
The full image list contains 401 entries.
Band g:
downloading deepest stacked image...
Band r:
downloading deepest stacked image...
Band i:
downloading deepest stacked image...
Downloaded 3 images.
Plot the images, plus a false-color 3-band image:
images = [im-np.median(im) for im in images] # subtract median from all images for better scaling
images += [make_lupton_rgb(*images[::-1],stretch=30)] # add a 3-color composite image
plot_images(images,titles=bands+['3-band image'])
Looks like a galaxy cluster!
idx = 0
images = download_deepest_images(tbl['ra'][idx], tbl['dec'][idx], fov=0.1, bands=bands) # FOV in deg
images = [im-np.median(im) for im in images] # subtract median from all images for better scaling
images += [make_lupton_rgb(*images[::-1],stretch=30)] # add a 3-color composite image
plot_images(images,titles=bands+['3-band image'])
The full image list contains 403 entries.
Band g:
downloading deepest stacked image...
Band r:
downloading deepest stacked image...
Band i:
downloading deepest stacked image...
Downloaded 3 images.
Definitely not a galaxy cluster, but a stellar overdensity!
Get photometry of stars within a 5 arcmin radius of each density peak.
def makequery(ra0,dec0,radius0=5./60.,field=169):
query = """
SELECT ra,dec,gmag,rmag,imag FROM smash_dr1.object
WHERE fieldid = {:d}
AND depthflag > 1
AND abs(sharp) < 0.5
AND gmag BETWEEN 9 AND 25
AND q3c_radial_query(ra,dec,{:f},{:f},{:f})
""".format(field,ra0,dec0,radius0)
return query
query0 = makequery(tbl['ra'][0],tbl['dec'][0]) # center ra & dec
R0 = qc.query(sql=query0, fmt='pandas') # using sql argument instead of the default adql
print(R0.head()) # a Pandas method
           ra        dec       gmag       rmag       imag
0  185.345295 -32.033874  24.745580  24.538908  24.439932
1  185.352840 -32.038977  24.814121  24.343794  23.955757
2  185.350117 -32.033695  24.803425  23.662357  23.163800
3  185.353101 -32.035724  23.915737  23.474472  23.258896
4  185.355643 -32.033433  24.231142  23.471200  23.154207
R0['g_r'] = R0['gmag'] - R0['rmag']
print( R0.head())
           ra        dec       gmag       rmag       imag       g_r
0  185.345295 -32.033874  24.745580  24.538908  24.439932  0.206672
1  185.352840 -32.038977  24.814121  24.343794  23.955757  0.470327
2  185.350117 -32.033695  24.803425  23.662357  23.163800  1.141068
3  185.353101 -32.035724  23.915737  23.474472  23.258896  0.441265
4  185.355643 -32.033433  24.231142  23.471200  23.154207  0.759942
query1 = makequery(tbl['ra'][1],tbl['dec'][1])
R1 = qc.query(sql=query1,fmt='pandas')
R1['g_r'] = R1['gmag'] - R1['rmag'] # compute color
print(R1.tail()) # another Pandas method
              ra        dec       gmag       rmag       imag       g_r
2170  184.471635 -32.119314  24.889196  24.793587  24.467518  0.095609
2171  184.481076 -32.122957  22.749125  21.331306  20.392641  1.417819
2172  184.470299 -32.112379  23.647750  22.085228  21.719313  1.562522
2173  184.470477 -32.112378  23.919018  22.351788  21.632957  1.567230
2174  184.467250 -32.106465  24.091139  23.340645  22.943100  0.750494
def plotpanel(axid,x,y,title='',xlim=(-1,2),ylim=(25.2,14)):
ax = fig.add_subplot(axid)
ax.scatter(x,y,marker='.',s=10, alpha=0.8)
ax.set_xlabel(x.name)
ax.set_ylabel(y.name)
ax.set_xlim(xlim)
ax.set_ylim(ylim)
ax.set_title(title)
fig = plt.figure(figsize=(12,5.5))
plotpanel(121,R0['g_r'],R0['gmag'],'white box, {:d} objects'.format(len(R0)))
plotpanel(122,R1['g_r'],R1['gmag'],'yellow box, {:d} objects'.format(len(R1)))

If you wish, you can now save your photometry table for Hydra II to a local file and take it with you.
outfile = 'hydra2.csv'
R0.to_csv(outfile,index=False)
hydra2.csv <-- Download the file as you please.
Ideas for further exploration:
Find overdensities elsewhere (everywhere!)
Optimize kernel sizes, computation, proper wavelets
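The wavelet idea can be sketched with numpy alone: build a 2-D Ricker ("Mexican hat") kernel and convolve once, instead of differencing two Gaussians. This is a toy sketch on fake data; the map, kernel sizes, and injected overdensity are all illustrative:

```python
import numpy as np

def ricker2d(size, sigma):
    # 2-D Ricker (Mexican hat) kernel on a size x size grid
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = (xx**2 + yy**2) / (2.0 * sigma**2)
    return (1.0 - r2) * np.exp(-r2)

def conv2_circular(img, ker):
    # FFT-based circular convolution; adequate for a peak well inside the map
    kpad = np.zeros_like(img)
    kh, kw = ker.shape
    kpad[:kh, :kw] = ker
    kpad = np.roll(kpad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kpad)))

rng = np.random.default_rng(42)
hist = rng.poisson(5, size=(60, 60)).astype(float)  # fake star-count map
hist[28:32, 28:32] += 10.0                          # injected overdensity
filtered = conv2_circular(hist, ricker2d(15, 2.0))
peak = np.unravel_index(np.argmax(filtered), filtered.shape)
print(peak)  # lands on the injected clump
```

astropy.convolution also ships a ready-made RickerWavelet2DKernel (MexicanHat2DKernel in older versions) that could replace the hand-rolled kernel here.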
Now we will run through the same process to re-discover the Centaurus I dwarf galaxy (Mau et al. 2020) in the DELVE DR2 dataset.
Query the delve_dr2.objects table for the following columns:
And with the following constraints:
query = """
SELECT __, __, __, __, __
FROM __
WHERE q3c_radial_query(ra, dec, __, __, 0.5)
AND __ BETWEEN __ AND __
AND (__ - __) BETWEEN __ AND __"""
print(query)
SELECT __, __, __, __, __
FROM __
WHERE q3c_radial_query(ra, dec, __, __, 0.5)
AND __ BETWEEN __ AND __
AND (__ - __) BETWEEN __ AND __
Run the query in synchronous mode and make the output format pandas.
%%time
df = __
print("Number of objects:", len(df))
print(df.head())
Number of objects: 2978
ra dec mag_auto_g mag_auto_r mag_auto_i
0 189.307640 -41.324229 22.742039 22.548280 22.021933
1 189.261247 -41.323464 21.527279 21.489342 21.411541
2 189.438062 -41.378271 21.589127 21.209211 21.158897
3 189.443497 -41.374310 21.653559 21.291859 21.216507
4 189.441928 -41.367302 23.944675 24.169296 24.210217
CPU times: user 29.4 ms, sys: 2.52 ms, total: 31.9 ms
Wall time: 838 ms
You might spot an overdensity already.
fig = plt.figure(figsize=(7,6))
plt.hexbin(__, __, gridsize=200)
plt.xlabel('RA (deg)')
plt.ylabel('Dec (deg)')
plt.colorbar(label='number of objects per spatial bin')
We'll use the default convolution kernels of 2 and 20 arcminutes in size.
%%time
small_k, big_k = __, __ # kernel sizes in arcminutes
raw, extent, delta, clipped, dsigma = dwarf_filter(__, __, fwhm_small=__, fwhm_big=__)
Computing differential convolution ....
dwarf_filter: mean = 0.116196 sigma = 0.451999 sigmaRaw = 1.07089
CPU times: user 27.3 ms, sys: 947 µs, total: 28.2 ms
Wall time: 27 ms
One dense peak will be visible.
fig, ax = plt.subplots(figsize=(7,6))
im = plt.imshow(__)
plt.xlabel('pixel')
plt.ylabel('pixel')
plt.colorbar(label='relative spatial density after convolution')
We'll use the photutils package to identify peaks lying at least 3 above the median in the clipped filtered image.
# find peaks
mean, median, std = stats.sigma_clipped_stats(__, sigma=3.0, maxiters=5)
print(f'Mean = {mean}, Median = {median}, Std = {std}')
tbl = find_peaks(__, median+3, box_size=int(small_k*2)) # box_size must be an integer number of pixels
# add ra & dec positions of peaks found
a, b = extent[:2]
xvec = np.arange(a, b, (b-a)/clipped.shape[1])
a, b = extent[2:]
yvec = np.arange(a, b, (b-a)/clipped.shape[0])
tbl['ra'] = xvec[tbl['x_peak']]
tbl['dec'] = yvec[-tbl['y_peak']-1]
print(tbl)
Mean = 0.15711787023236085, Median = 0.11619622055078001, Std = 0.08005803782621249
x_peak y_peak peak_value ra dec
------ ------ ----------------- ----------------- ------------------
30 30 4.137728039557333 189.5845272303593 -40.90449882873472
ecs = ['w','y'] # colors of box frames
ax.scatter(__, __, marker='s', s=tbl['peak_value']*40,
c='none', edgecolors=ecs, lw=3) # keeps writing to previous ax
fig # repeats (the updated) figure
Fill in the blanks for the query_template below to select the following columns from the delve_dr2.objects table:
And with the following constraints:
within radius0 of RA=ra0 and Dec=dec0
def makequery(ra0, dec0, radius0):
query_template = """
SELECT __, __, __, __, __
FROM __
WHERE q3c_radial_query(ra, dec, {0}, {1}, {2})
AND __ BETWEEN __ AND __"""
query = query_template.format(ra0, dec0, radius0)
return query
Get photometry of stars within a 5 arcmin radius of the density peak.
query0 = makequery(__, __, 5./60.) # center ra & dec
R0 = __
print(R0.head()) # a pandas method
           ra        dec  mag_auto_g  mag_auto_r  mag_auto_i
0  189.526334 -40.974901   23.137787   22.216522   22.119255
1  189.534045 -40.975292   23.097633   21.711887   21.064604
2  189.529183 -40.974057   24.786463   23.506096   23.684706
3  189.521272 -40.970934   21.012383   19.464642   18.525452
4  189.516410 -40.969993   19.250240   18.589230   18.379360
We can use the same plotpanel function as before.
R0['g_r'] = __
print(R0.head())
           ra        dec  mag_auto_g  mag_auto_r  mag_auto_i       g_r
0  189.526334 -40.974901   23.137787   22.216522   22.119255  0.921265
1  189.534045 -40.975292   23.097633   21.711887   21.064604  1.385746
2  189.529183 -40.974057   24.786463   23.506096   23.684706  1.280367
3  189.521272 -40.970934   21.012383   19.464642   18.525452  1.547741
4  189.516410 -40.969993   19.250240   18.589230   18.379360  0.661010
fig = plt.figure(figsize=(12,5.5))
plotpanel(121, __, __, 'white box, {:d} objects'.format(len(R0)))
Nidever et al. (2017) "SMASH - Survey of the MAgellanic Stellar History": http://adsabs.harvard.edu/abs/2017AJ....154..199N
Stanford et al. (2005, ApJ, 634, 2, L129) "An IR-selected Galaxy Cluster at z = 1.41": http://adsabs.harvard.edu/abs/2005ApJ...634L.129S
Koposov et al. (2008, ApJ, 686, 279) "The Luminosity Function of the Milky Way Satellites": http://adsabs.harvard.edu/abs/2008ApJ...686..279K
Martin et al. (2015, ApJ, 804, 5) "Hydra II: A Faint and Compact Milky Way Dwarf Galaxy Found in the Survey of the Magellanic Stellar History": http://adsabs.harvard.edu/abs/2015ApJ...804L...5M